
arXiv.org e-Print Archive

Towards Computing Inferences from English News Headlines

Author: A Kronrod
A Pilkington
D Dor
E Iarovici
G Yule
H Paul Grice
I Dagan
M-C de Marneffe
SC Levinson
V Fromkin
V Pekar
Publication date: 18/10/2019

Newspapers are a popular form of written discourse, read by many people because of the novelty of the information they carry. The headline is the most widely read part of any newspaper because it appears in a larger font and sometimes in colour. In this paper, we propose and implement a method for computing inferences from English news headlines, excluding information from the context in which the headlines appear. The method attempts to generate the assumptions a reader is likely to form on reading a fresh headline. The generated inferences could be useful for assessing the impact of a news headline on readers, including children. Understanding the current state of social affairs depends greatly on how headlines are assimilated. Because context-independent inferences depend mainly on the syntax of the headline, this approach uses dependency trees of headlines to find their syntactic structure and to compute inferences from them. Comment: PACLING 2019 long paper, 15 pages

Repositorium für Naturwissenschaften und Technik

Reactive plasma cleaning and restoration of transition metal dichalcogenide monolayers

Author: Arutchelvan Goutham
Asselberghs Inge
Bal Kristof M.
Banerjee Sreetama
De Gendt Stefan
de Marneffe Jean-François
El Kazzi Salim
Heyne Markus Hartmut
Lin Dennis
Mankelevich Yuri
Marinov Daniil
Nalin Mehta Ankit
Neyts Erik C.
Rakhimova Tatyana
Smets Quentin
Voronina Ekaterina
With Patrick C.
Wyndaele Pieter-Jan
Zhang Jianran
Publication venue: London : Nature Publishing Group
Publication date: 01/01/2021

The cleaning of two-dimensional (2D) materials is an essential step in the fabrication of future devices that leverage their unique physical, optical, and chemical properties. Transition metal dichalcogenides (TMDs) are among these emerging 2D materials. So far, there is limited understanding of the cleaning of “monolayer” TMD materials. In this study, we report on the use of downstream H2 plasma to clean the surface of monolayer WS2 grown by MOCVD. We demonstrate that high-temperature processing is essential, as it maximizes the removal rate of polymers and mitigates damage to the WS2 in the form of sulfur vacancies. We show that a low-temperature in situ carbonyl sulfide (OCS) soak is an efficient way to resulfurize the material, in addition to high-temperature H2S annealing. The cleaning processes and mechanisms elucidated in this work are tested on back-gated field-effect transistors, confirming that the transport properties of WS2 devices can be maintained by combining H2 plasma cleaning with OCS restoration. The low-damage plasma cleaning based on H2 and OCS is very reproducible, fast (completed in a few minutes), and uses a 300 mm industrial plasma etch system qualified for standard semiconductor pilot production. This process is therefore expected to enable the industrial scale-up of 2D-based devices co-integrated with silicon technology.

Semantically linking molecular entities in literature through entity relationships

Author: A Airola
A Reverter
Bernard De Baets
C Burgess
D Jurgens
D McClosky
DLT Rohde
E Charniak
EW Sayers
H Kilicoglu
I Tsochantaridis
J Björne
Jari Björne
JD Kim
M Buckland
M de Marneffe
M Krallinger
M Miwa
M Sahlgren
MF Porter
R Leaman
S Pyysalo
S van Dongen
S Van Landeghem
Sofie Van Landeghem
T Ohta
Tapio Salakoski
The UniProt Consortium
Thomas Abeel
TK Landauer
VN Vapnik
Yves Van de Peer
Publication venue: BioMed Central
Publication date: 01/01/2012

Background: Text mining tools have gained popularity for processing the vast number of research articles available in the biomedical literature. It is crucial that such tools extract information at a sufficient level of detail to be applicable in real-life scenarios. Studies of mining non-causal molecular relations contribute to this goal by formally identifying the relations between genes, promoters, complexes and various other molecular entities found in text. More importantly, these studies help to enhance the integration of text mining results with database facts. Results: We describe, compare and evaluate two frameworks developed for the prediction of non-causal or 'entity' relations (REL) between gene symbols and domain terms. For the corresponding REL challenge of the BioNLP Shared Task of 2011, these systems ranked first (57.7% F-score) and second (41.6% F-score). In this paper, we investigate the performance discrepancy of 16 percentage points by benchmarking on a related and more extensive dataset, analysing the contribution of both the term detection and relation extraction modules. We further construct a hybrid system combining the two frameworks and experiment with intersection and union combinations, achieving high-precision and high-recall results, respectively. Finally, we highlight extremely high-performance results (F-score > 90%) obtained for the specific subclass of embedded entity relations that are essential for integrating text mining predictions with database facts. Conclusions: The results from this study will enable us in the near future to annotate semantic relations between molecular entities in the entire scientific literature available through PubMed. The recent release of the EVEX dataset, containing biomolecular event predictions for millions of PubMed articles, is an interesting and exciting opportunity to overlay these entity relations with event predictions on a literature-wide scale.
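The intersection and union combinations mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the relation pairs, the names `system_a` and `system_b`, and the helper `prf` are all hypothetical, chosen only to show why intersecting two systems' predictions tends to favour precision while taking their union tends to favour recall.

```python
def prf(predicted, gold):
    """Return (precision, recall, F1) for two sets of relation pairs."""
    tp = len(predicted & gold)  # true positives: predictions found in gold
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical toy data: (gene symbol, domain term) pairs, not from the paper.
gold = {
    ("geneA", "promoter1"),
    ("geneB", "complex2"),
    ("geneC", "domain3"),
    ("geneD", "complex4"),
}
system_a = {("geneA", "promoter1"), ("geneB", "complex2"), ("geneX", "domain9")}
system_b = {("geneA", "promoter1"), ("geneC", "domain3"), ("geneY", "complex7")}

# Intersection keeps only relations both systems agree on (fewer, safer).
intersection = system_a & system_b
# Union keeps every relation proposed by either system (more, noisier).
union = system_a | system_b

for name, predictions in [
    ("A", system_a),
    ("B", system_b),
    ("intersection", intersection),
    ("union", union),
]:
    p, r, f = prf(predictions, gold)
    print(f"{name:12s} precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

On this toy data the intersection is fully precise but recovers only one gold relation, while the union recovers three of the four gold relations at lower precision. The same trade-off is what the abstract reports for the hybrid system.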

TU Delft Repository

Ghent University Academic Bibliography

Archivsystem Ask23

Learning perceptually grounded word meanings from unaligned parallel data

Author: A. Vogel
C. A. Thompson
C. Matuszek
D. L. Chen
H. Poon
J. Clarke
J. Dzifcak
Joshua Joseph
K. Hsiao
L. S. Zettlemoyer
M. MacMahon
M. Marneffe de
M. Skubic
N. Mavridis
Nicholas Roy
P. Liang
P. Rybski
Pratiksha Thaker
R. S. Jackendoff
S. Chernova
S. Piantadosi
S. R. K. Branavan
S. Tellex
Stefanie Tellex
T. Kollar
T. Kwiatkowski
Y. Wong
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/05/2012

For robots to understand natural language commands effectively, they must be able to acquire meaning representations that can be mapped to perceptual features in the external world. Previous approaches to learning these grounded meaning representations require detailed annotations at training time. In this paper, we present an approach to grounded language acquisition that is capable of jointly learning a policy for following natural language commands such as “Pick up the tire pallet,” as well as a mapping between specific phrases in the language and aspects of the external world; for example, the mapping between the words “the tire pallet” and a specific object in the environment. Our approach assumes a parametric form for the policy that the robot uses to choose actions in response to a natural language command, a form that factors according to the structure of the language. We use a gradient method to optimize model parameters. Our evaluation demonstrates the effectiveness of the model on a corpus of commands given to a robotic forklift by untrained users. U.S. Army Research Laboratory (Collaborative Technology Alliance Program, Cooperative Agreement W911NF-10-2-0016); United States. Office of Naval Research (MURIs N00014-07-1-0749); United States. Army Research Office (MURI N00014-11-1-0688); United States. Defense Advanced Research Projects Agency (DARPA BOLT program under contract HR0011-11-2-0008)

DSpace@MIT

Large-scale directional relationship extraction and resolution

Author: A Culotta
A Gladki
A Koike
A Yuryev
AB Clegg
C Rodriguez-Penagos
CM Topinka
Cory B Giles
D Zhou
F Rinaldi
H Chen
H Jang
H Kim
I Donaldson
IK Ruf
J Ding
J Jiang
JA Mitchell
JC Park
JD Kim
JD Wren
Jonathan D Wren
JP Vaque
K Fundel
K Sagae
LM Juliano
M Bundschus
M Chagoyen
M Huang
M Lease
M Wang
M-C de Marneffe
N Daraselia
P Zweigenbaum
R Bunescu
R Kuffner
RC Bunescu
RT Tsai
S Kim
S Novichkova
TK Jenssen
W Pratt
Publication venue: BioMed Central
Publication date: 01/01/2008

All-paths graph kernel for protein-protein interaction extraction with evaluation of cross-corpus learning

Author: A Airola
A Yakushiji
AB Clegg
Antti Airola
AP Bradley
C Giuliano
C Nédellec
CD Meyer
D Zelenko
Filip Ginter
J Björne
J Ding
J Heimonen
JA Hanley
JAK Suykens
Jari Björne
JD Kim
JG Caporaso
K Fundel
KB Cohen
L Hirschman
L Hunter
M Lease
M Miwa
MC de Marneffe
P Zweigenbaum
R Bunescu
R Rifkin
R Sætre
S Pyysalo
S Van Landeghem
Sampo Pyysalo
T Gärtner
T Mitsumori
T Pahikkala
Tapio Pahikkala
Tapio Salakoski
Y Miyao
Publication venue: BioMed Central
Publication date: 01/01/2008

Benchmarking natural-language parsers for biological applications using dependency graphs

Author: A Bies
AB Clegg
Adrian J Shepherd
Andrew B Clegg
B Rosario
B Srinivas
C Friedman
C Grover
D Blaheta
D Gildea
D Klein
D Lin
D Sleator
DM Bikel
E Charniak
E Tsivtsivadze
EB Camon
EJ Briscoe
G Sampson
G Schneider
IM Goldin
J Carroll
J Finkel
J Xiao
JM Temkin
K Franzén
K Knight
KB Cohen
L Smith
M Collins
M Lease
MC de Marneffe
MP Marcus
N Domedel-Puig
N Ge
O Sanchez
P Merlo
PG Mutalik
S Abney
S Kübler
S Pyysalo
ST Ahmed
T Briscoe
TC Rindflesch
Y Huang
Z Shi
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: Interest is growing in the application of syntactic parsers to natural language processing problems in biology, but assessing their performance is difficult because differences in linguistic convention can falsely appear to be errors. We present a method for evaluating their accuracy using an intermediate representation based on dependency graphs, in which the semantic relationships important in most information extraction tasks are closer to the surface. We also demonstrate how this method can be easily tailored to various application-driven criteria. RESULTS: Using the GENIA corpus as a gold standard, we tested four open-source parsers that have been used in bioinformatics projects. We first present overall performance measures, and then test the two leading tools, the Charniak-Lease and Bikel parsers, on subtasks tailored to reflect the requirements of a system for extracting gene expression relationships. These two tools clearly outperform the other parsers in the evaluation, and achieve accuracy levels comparable to or exceeding those of native dependency parsers on similar tasks in previous biological evaluations. CONCLUSION: Evaluating with dependency graphs allows parsers to be tested easily on criteria chosen according to the semantics of particular biological applications, drawing attention to important mistakes and absorbing many insignificant differences that would otherwise be reported as errors. Generating high-accuracy dependency graphs from the output of phrase-structure parsers also provides access to the more detailed syntax trees that are used in several natural-language processing techniques.
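As a rough illustration of dependency-graph evaluation, the sketch below scores a parser's output against a gold standard as sets of (head, relation, dependent) triples. It is not the paper's evaluation code. The function `score`, the example sentence, and the relation labels (`nsubj`, `dobj`, `nn`, `amod`) are assumptions made for illustration. The optional `relations` filter stands in for the "application-driven criteria" the abstract mentions, and `labeled=False` ignores label disagreements.

```python
def score(predicted, gold, relations=None, labeled=True):
    """Return (precision, recall) over dependency triples.

    predicted, gold: sets of (head, relation, dependent) triples.
    relations: optional set of relation labels to restrict scoring to.
    labeled: if False, compare only (head, dependent) attachments.
    """
    def project(edges):
        kept = {e for e in edges if relations is None or e[1] in relations}
        if labeled:
            return kept
        return {(head, dep) for head, _, dep in kept}

    p, g = project(predicted), project(gold)
    tp = len(p & g)  # edges present in both graphs
    precision = tp / len(p) if p else 0.0
    recall = tp / len(g) if g else 0.0
    return precision, recall


# Hypothetical toy sentence: "IL-2 activates STAT5 expression"
gold = {
    ("activates", "nsubj", "IL-2"),
    ("activates", "dobj", "expression"),
    ("expression", "nn", "STAT5"),
}
predicted = {
    ("activates", "nsubj", "IL-2"),
    ("activates", "dobj", "expression"),
    ("expression", "amod", "STAT5"),  # right attachment, different label
}

print("labeled:", score(predicted, gold))
print("unlabeled:", score(predicted, gold, labeled=False))
print("subject/object only:", score(predicted, gold, relations={"nsubj", "dobj"}))
```

In this toy case the only disagreement is a label choice on a noun modifier. It lowers the strict labeled score but disappears when labels are ignored or when scoring is restricted to the subject and object relations an extraction task might care about.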

Directory of Open Access Journals

Methods to actively modify the dynamic response of cm-scale FWMAV designs

Author: Adams S G
Antoniou A
Bolsman C T
Brenner M P
de Marneffe B
de Volder M
F van Keulen
Géradin M
H J Peters
Howell L L
J F L Goosen
Leland E S
Peters C
Peters H J
Rao S S
Sreetharan P S
Stanway R
Tai-Ran H
Wang Q
Weis Fogh T
Publication venue: 'IOP Publishing'